File Hash calculation
The File Hash acts as a "digital fingerprint" that uniquely identifies each file. During processing, Epiq Discovery by default calculates the File Hash using one of the following methods and populates the File Hash field.
- When email or ICS (Calendar) information is available, Epiq Discovery calculates the File Hash using the following properties:
  - Date
  - To (count)
  - From
  - CC (count)
  - BCC (count)
  - Subject
  - Body Text
  [Figure: File hash for email and calendar items.]
- When email or ICS information is not available, Epiq Discovery calculates the binary hash. For Microsoft Teams chat MSG files, the system calculates the binary hash.
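The two methods above can be illustrated with a minimal sketch. It is not Epiq Discovery's implementation: the hash algorithm (SHA-256), the field separator, the way the properties are encoded, and the names `EmailInfo` and `file_hash` are all assumptions made for illustration only.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EmailInfo:
    """Hypothetical container for the email/ICS properties listed above."""
    date: str
    sender: str
    to: List[str] = field(default_factory=list)
    cc: List[str] = field(default_factory=list)
    bcc: List[str] = field(default_factory=list)
    subject: str = ""
    body_text: str = ""


def file_hash(raw: bytes, email: Optional[EmailInfo] = None,
              is_teams_chat: bool = False) -> str:
    """Return a hex digest. SHA-256 and the encoding are assumptions."""
    if email is None or is_teams_chat:
        # No email/ICS information, or a Teams chat MSG: hash the raw bytes.
        return hashlib.sha256(raw).hexdigest()
    parts = [
        email.date,
        str(len(email.to)),    # To (count)
        email.sender,          # From
        str(len(email.cc)),    # CC (count)
        str(len(email.bcc)),   # BCC (count)
        email.subject,
        email.body_text,
    ]
    # A unit separator keeps adjacent fields from running together.
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()
```

In this sketch, two copies of the same message produce the same value even when their stored bytes differ, because only the listed properties feed the hash. Files without email information match only when they are byte-for-byte identical.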
Custom File Hash calculation
For ICS files and email files, such as MSG and EML, you can specify the fields used for File Hash calculation. This feature enables the deduplication of identical emails that were collected and processed from different platforms, such as an email archive and an active email server. The system does not support Custom File Hash calculation for the following documents:
- Parent ICS files that are collected using the Collect>Microsoft 365 option
- Bloomberg email files
- Microsoft Teams chat MSG files
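As a rough illustration of why selecting fields helps deduplication, the sketch below hashes only the chosen fields and skips the unsupported document types. The function name `custom_file_hash`, the `doc_type` labels, SHA-256, and the example in which two copies differ only in their Date value are hypothetical and not taken from the product.

```python
import hashlib
from typing import Dict, Optional, Sequence

# Hypothetical labels for the document types listed above as unsupported.
UNSUPPORTED = {"m365_parent_ics", "bloomberg_email", "teams_chat_msg"}


def custom_file_hash(props: Dict[str, str], fields: Sequence[str],
                     doc_type: str) -> Optional[str]:
    """Hash only the chosen fields; return None for unsupported types."""
    if doc_type in UNSUPPORTED:
        return None
    # Sort the field names so the selection order does not affect the result.
    chosen = [f"{name}={props.get(name, '')}" for name in sorted(fields)]
    return hashlib.sha256("\x1f".join(chosen).encode("utf-8")).hexdigest()


# Two copies of one email whose Date values differ between platforms (assumed).
archive = {"From": "a@example.com", "Subject": "Hi",
           "Body Text": "Hello", "Date": "2024-01-05T10:00:00Z"}
server = dict(archive, Date="2024-01-05 10:00:01")
fields = ["From", "Subject", "Body Text"]
same = custom_file_hash(archive, fields, "eml") == custom_file_hash(server, fields, "eml")
```

Here `same` is true because Date is left out of the selection. Adding Date back to `fields` would make the two copies hash differently.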
By default, this feature is disabled; you can enable it in the Project Settings. When you apply Custom File Hash calculation, the system populates both the File Hash and File Hash – Original fields: the File Hash field displays the Custom File Hash value, and the File Hash – Original field displays the default File Hash value. When you do not apply Custom File Hash calculation, the system populates only the File Hash field.
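The field behavior described above can be summarized in a short sketch. The function `populate_hash_fields` is a hypothetical helper, and passing `None` to mean "Custom File Hash not applied" is an assumption. Only the two field names come from the text.

```python
from typing import Dict, Optional


def populate_hash_fields(default_hash: str,
                         custom_hash: Optional[str]) -> Dict[str, str]:
    """Return the hash fields populated for one document."""
    if custom_hash is None:
        # Custom File Hash not applied: only File Hash is populated.
        return {"File Hash": default_hash}
    # Custom File Hash applied: the default value moves to the Original field.
    return {"File Hash": custom_hash, "File Hash – Original": default_hash}
```

For example, `populate_hash_fields("d1", "c1")` yields `File Hash = c1` and `File Hash – Original = d1`, while `populate_hash_fields("d1", None)` yields only `File Hash = d1`.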
The following list provides related topics.
- To enable Custom File Hash calculation, refer to Modify Custom File Hash calculation setting.
- For more information about how the Processing Information field is coded for the File Hash calculation, refer to Processing Information.